
WIP: Bump OpenBLAS to 0.2.9 and LAPACK to 3.5.0 #7213
Merged: 2 commits into master from vs/openblas, Jun 12, 2014

Conversation

@ViralBShah (Member)

I get the following error, which is why I have created this PR bumping OpenBLAS and LAPACK.
Addresses JuliaLang/LinearAlgebra.jl#116

```
exception on 5: ERROR: BoundsError()
while loading readdlm.jl, in expression starting on line 4
ERROR: BoundsError()
 in anonymous at ./task.jl:1350
while loading readdlm.jl, in expression starting on line 4
while loading /home/viral/julia/test/runtests.jl, in expression starting on line 46
```

OpenBLAS 0.2.9 builds fine on OS X and Linux. It would be great if someone could test it on Windows. Cc: @jiahao

@ViralBShah (Member Author)

A `make clean` seems to have fixed it, and all tests are passing now.

@ViralBShah (Member Author)

On Linux, the readdlm error is a remnant of LLVM 3.4. Rebuilding with LLVM 3.3.

@staticfloat (Member)

Works for me on OS X.

@ViralBShah (Member Author)

I have just removed the 0.2.9rc1 checksums, which were used on Windows. If we want to retain the old checksums, we also need to retain the library patches and other stuff - which just feels like too much maintenance.

@StefanKarpinski (Member)

Why do we need to retain the patches to retain the checksums?

@ViralBShah (Member Author)

The old versions will presumably be useless without the patches. Also, why would we want to retain checksums for release candidates? I feel this PR is fine the way it currently is: it does not delete checksums of old released versions. If something comes up in the future, let's deal with it then.

@StefanKarpinski (Member)

If someone downloads the wrong version of a dependency, it won't come up – they just won't know. I guess deleting only rc checksums is better but it feels kind of arbitrary.

@ViralBShah (Member Author)

Would be nice if someone can test this out on Windows.

cc: @quinnj @ihnorton

@andreasnoack (Member)

cc: @tkelman

ViralBShah added this to the 0.3 milestone (Jun 11, 2014)
@tkelman (Contributor) commented Jun 11, 2014

Win32 on Sandy Bridge looks okay: no newly introduced test failures (a few pre-existing, unrelated ones are still open). I think the most recently found Windows AMD bug is still open upstream. Will post again in a couple of hours with Win64 results.

@ViralBShah (Member Author)

Yes, the AMD bug is still open upstream. I really don't know what to do about it. Shipping with that bug just does not feel right.

@quinnj (Member) commented Jun 11, 2014

Things look good on Windows 8.1 64-bit as well.

@tkelman (Contributor) commented Jun 11, 2014

@quinnj how much of the test suite are you running? Do #7082 and #6144 not happen on Windows 8 (unrelated to this PR, but I'm curious)?

@quinnj (Member) commented Jun 11, 2014

Yes, I see both #7082 and #6144. I'm also seeing something related to #3399, and #7218 is a new one this morning. But nothing seems OpenBLAS-related (famous last words).

@ViralBShah (Member Author)

@staticfloat We will need to update juliadeps too, or else it is possible that my test for JuliaLang/LinearAlgebra.jl#114 will cause tests to fail. Should I comment out that test and merge this now, or should we hold this PR until juliadeps is updated?

@ViralBShah (Member Author)

Also, can someone test this on 32-bit Linux?

@staticfloat (Member)

Let's merge now. I'll update juliadeps today.

@staticfloat (Member)

Errr, let's merge after we've done all the testing we need. @nalimilan, could you perhaps test this on one of your 32-bit Linux boxes?

ViralBShah changed the title from "Bump OpenBLAS to 0.2.9 and LAPACK to 3.5.0" to "WIP: Bump OpenBLAS to 0.2.9 and LAPACK to 3.5.0" (Jun 11, 2014)
@StefanKarpinski (Member)

For OpenBLAS, I know there were some outstanding bugs that 0.2.9 fixed, but do we really need to bump LAPACK? The run-up to a release candidate isn't really the time for upgrading dependencies, unless those upgrades fix known issues that are blocking the release...

@andreasnoack (Member)

OpenBLAS 0.2.9 upgrades its bundled LAPACK to 3.5.0, which I think is the reason for us to do the same. Most users will use the LAPACK from OpenBLAS anyway.

@StefanKarpinski (Member)

Ah, ok. That makes sense then.

Remove LAPACK 3.4.2 patches (for slasd4 and dlasd4). Fixed in 3.5.0.
Remove openblas 0.2.9rc1 checksum.
Remove openblas osx10.9 patch, which was required for openblas 0.2.8.
The OpenBLAS gemv bug is fixed now (#6941), and we can call gemv. Add a test.
Remove Windows specific openblas version in Versions.make.
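The "Add a test" item presumably covers a regression check along these lines (a hypothetical Julia sketch, not the actual test added in this commit):

```julia
using Base.Test

# Compare BLAS gemv against explicit dot products; the gemv path had been
# avoided because of the OpenBLAS bug referenced in #6941.
A = rand(10, 10)
x = rand(10)
y = Base.LinAlg.BLAS.gemv('N', 1.0, A, x)
@test_approx_eq y [dot(vec(A[i,:]), x) for i = 1:10]
```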
@ViralBShah (Member Author)

I verified that this works on 32-bit Linux, passing all tests.

ViralBShah added a commit that referenced this pull request Jun 12, 2014
WIP: Bump OpenBLAS to 0.2.9 and LAPACK to 3.5.0
ViralBShah merged commit 6dfbe71 into master (Jun 12, 2014)
ViralBShah deleted the vs/openblas branch (June 12, 2014 17:37)
@staticfloat (Member)

Regarding checksums, I don't think deleting old ones is a problem: a user who wants an older dependency version should be using the Julia code that was developed with that dependency in mind, which means checking out past git commits, and those commits carry the checksums relevant to the code base they describe. With this update we get rid of the workarounds for OpenBLAS 0.2.8, so we really shouldn't make it easy to use 0.2.8 from this commit forward; otherwise users might get incorrect matrix multiplies and the like.

@tkelman (Contributor) commented Jun 13, 2014

Just a heads-up to anyone who might run into problems: I've found that some of the new Haswell kernels in the bumped OpenBLAS version don't compile properly unless you're using a fairly recent GCC. If you run into this and don't have a Haswell machine, you can get away with setting OPENBLAS_DYNAMIC_ARCH=0. If you do have a Haswell, setting OPENBLAS_TARGET_ARCH to something older might work too.
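For reference, the workaround could be dropped into `Make.user` at the top of the source tree (a sketch using the variable names mentioned above; whether the deps build honors them verbatim from `Make.user` is an assumption worth checking):

```make
# Make.user sketch: skip the runtime-dispatched (DYNAMIC_ARCH) build so the
# new Haswell kernels are never compiled by an old GCC.
OPENBLAS_DYNAMIC_ARCH = 0
# Or, when building on a Haswell machine itself, pin an older target:
# OPENBLAS_TARGET_ARCH = SANDYBRIDGE
```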

@ViralBShah (Member Author)

0.2.8 and earlier have some serious bugs and should be avoided. If 0.2.9 doesn't work for someone, they would be better off using ATLAS or even the reference BLAS at this point.

@tknopp (Contributor) commented Jun 13, 2014

@ViralBShah: See my comment and the following discussion in JuliaLang/LinearAlgebra.jl#11. I know that switching to a generic Julia implementation is not an option right now (and that is not what I am proposing), but in the long run it would be really valuable to have a generic Julia "BLAS" in Base and put the "real" BLAS on top of it, with an easy opt-out mechanism (see #5155). I really think it would be a big advantage not to depend so heavily on a library that has lately caused quite a few issues on the Julia bug tracker.

Having said that, I am not sure how feasible my suggestion is for LAPACK. Can LAPACK work without a BLAS library, or is the reference BLAS library more stable?

@ViralBShah (Member Author)

It would be a fair amount of effort, but it is certainly possible. The generic Cholesky is a step in this direction.
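To give a flavor of what "generic" means here, a minimal unblocked Cholesky sketch that works for any element type closed under `sqrt` and division (e.g. `Float64` or `BigFloat`); illustrative only, not the implementation in Base:

```julia
# Lower-triangular Cholesky factor of a symmetric positive definite matrix,
# written against AbstractMatrix so it is not tied to BLAS element types.
function generic_chol(A::AbstractMatrix)
    n = size(A, 1)
    L = zeros(eltype(A), n, n)
    for j = 1:n
        s = A[j, j]
        for k = 1:j-1
            s -= L[j, k]^2
        end
        L[j, j] = sqrt(s)
        for i = j+1:n
            t = A[i, j]
            for k = 1:j-1
                t -= L[i, k] * L[j, k]
            end
            L[i, j] = t / L[j, j]
        end
    end
    return L
end
```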

@ViralBShah (Member Author)

On the GCC version and the Haswell kernels: we could detect the GCC version and restrict the architecture to Nehalem for older compiler versions, along the lines sketched below.
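Something like this in the deps build could do it (a hypothetical Makefile fragment; the GCC 4.8 cutoff is a guess, and the version comparison is deliberately naive):

```make
# Hypothetical fragment: fall back to pre-Haswell kernels when the
# detected compiler is too old to build them.
GCC_VER := $(shell $(CC) -dumpversion)
GCC_TOO_OLD := $(shell expr $(GCC_VER) \< 4.8)  # naive string comparison
ifeq ($(GCC_TOO_OLD),1)
OPENBLAS_DYNAMIC_ARCH := 0
OPENBLAS_TARGET_ARCH := NEHALEM
endif
```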

@tknopp (Contributor) commented Jun 13, 2014

@ViralBShah: Do you have a feeling for how much of LAPACK's functionality we use? In the case of BLAS, I counted 28 functions, which should be straightforward to implement if I haven't missed something fundamental. Do we expose anything from LAPACK, or is it just a few specific algorithms like the Cholesky decomposition?

@andreasnoack (Member)

LAPACK provides solvers for linear systems of equations, least squares problems, eigenvalue problems, and singular value decompositions. With the generic Cholesky, I'd say we cover most of the necessary functionality for linear equations. For least squares we need a pivoted QR, which we don't have yet. No solvers for eigenvalues or singular values are implemented yet, and matching them with Julia implementations would require much work. I have started working on symmetric eigenproblems, but there is still a long way to go.

@tknopp (Contributor) commented Jun 13, 2014

Thanks, Andreas. It looks like LAPACK would be much more work, then. And if I understand correctly, LAPACK requires a BLAS to be available, so the two cannot be decoupled.

@tkelman (Contributor) commented Jun 13, 2014

Though if you could export a BLAS-compatible, C-callable API from the native Julia BLAS implementations, you could hypothetically have the reference Fortran LAPACK call the Julia BLAS routines. You lose the type generality going back and forth like that, but it could be an amusing exercise.
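As a taste of what that could look like (a sketch; `julia_ddot` is a made-up name, it assumes positive strides, and a real Fortran-facing BLAS would need the full routine set with exactly matching conventions):

```julia
# Fortran passes every argument by reference, so the C-callable version
# takes pointers and dereferences them with unsafe_load.
function julia_ddot(n_::Ptr{Cint}, x::Ptr{Float64}, incx_::Ptr{Cint},
                    y::Ptr{Float64}, incy_::Ptr{Cint})
    n, incx, incy = unsafe_load(n_), unsafe_load(incx_), unsafe_load(incy_)
    s = 0.0
    for i = 0:n-1
        s += unsafe_load(x, 1 + i*incx) * unsafe_load(y, 1 + i*incy)
    end
    return s
end

# A C function pointer that a Fortran LAPACK could in principle link against.
const ddot_ptr = cfunction(julia_ddot, Float64,
                           (Ptr{Cint}, Ptr{Float64}, Ptr{Cint}, Ptr{Float64}, Ptr{Cint}))
```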

@tknopp (Contributor) commented Jun 13, 2014

:-) Sounds like an awesome real-life test for the C-callable API export functionality Jeff has shown us.

@ViralBShah (Member Author)

It is extremely difficult to get a high-quality, high-performance BLAS implementation, but it is worth attempting, starting with something that is within 2-4x of tuned BLAS performance.

@ViralBShah (Member Author)

Let's have a new issue to discuss the BLAS stuff.

@lindahua (Contributor)

BLAS is a huge effort to maintain, much less to develop from scratch.

Yes, we have seen some bugs lately, but it works great overall. Why not just contribute and help them fix bugs when necessary? This is how open source projects work. It would be much easier to contribute fixes for those bugs than to start a new BLAS implementation afresh.

@StefanKarpinski (Member)

Reluctance to fix basic, critical correctness and performance bugs is the main issue for me: e.g. incorrect matrix products (OpenMathLib/OpenBLAS#380) and ridiculously low threading thresholds (OpenMathLib/OpenBLAS#103). The continual problems on new architectures seem suspicious to me as well, since Intel is obsessive about backwards compatibility. Shouldn't running code intended for an older CPU only cost some performance opportunity, not correctness or the ability to compile?

@quinnj (Member) commented Jun 13, 2014

There would be other big wins with the option to use only a native Julia BLAS: much quicker build times (particularly on Windows), a smaller total package, easier potential compilation to other languages (e.g. JavaScript), and easier static compilation, in addition to Stefan's points about cross-platform and build issues.

I don't think we need to plan to move away from OpenBLAS at all, but being able to "turn it off" and still have native Julia BLAS routines with all their benefits would be a huge win.

@tknopp (Contributor) commented Jun 13, 2014

@ViralBShah: Sure, JuliaLang/LinearAlgebra.jl#11 is actually the issue for a generic BLAS implementation. However, the reason I cross-referenced this issue is to show that performance is not everything; all these BLAS-related problems should also be taken into account in that discussion. On your point that it is difficult to implement a high-performance BLAS: that is certainly true. But a user who handles lots of small vectors will be surprised at how slow this high-performance BLAS is, so one cannot say that OpenBLAS is faster than a generic Julia BLAS implementation in general.

It all depends on the viewpoint. In my opinion it is more important to have something bug-free and stable; speed comes second.
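One way to see the small-vector point (a sketch; timings are machine-dependent, but for length-8 vectors the ccall and argument-checking overhead of the BLAS path can dominate the kernel itself):

```julia
# Plain Julia loop vs. the wrapped BLAS axpy for tiny vectors.
function axpy_naive!(a, x, y)
    for i = 1:length(x)
        y[i] += a*x[i]
    end
    return y
end

x, y = rand(8), rand(8)
@time for k = 1:10^6; axpy_naive!(2.0, x, y); end
@time for k = 1:10^6; Base.LinAlg.BLAS.axpy!(2.0, x, y); end
```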

@lindahua (Contributor)

Some of the bugs we are currently facing seem to occur only on certain architectures. A high-performance BLAS implementation usually relies on code that is highly dependent on the architecture and system configuration, which makes it much more difficult to test and maintain than generic code. You have to test it on all kinds of architectures and configurations to ensure its correctness.

Yes, Intel definitely cares about backward compatibility. However, they deliver new instruction sets every year or two (look at the history from MMX and SSE to AVX). BLAS is often among the first to exercise those new instructions, to make sure it extracts every bit of the CPU's capability. This can sometimes lead to subtle bugs on certain architectures.

I just read the threads in OpenBLAS. It seems to me that they are willing to address those bugs, but it probably takes some time to figure out what is happening and under what conditions a bug can be reproduced. The highly architecture-dependent nature complicates things a lot, and hence bugs sometimes take longer than expected to resolve.

Generally, I don't object to the idea of having a generic Julia implementation of basic BLAS functions. It could serve two purposes: (1) use with small vectors/matrices, where an external BLAS may be slower due to call overhead; and (2) use as a fallback when bugs in OpenBLAS (or whatever external BLAS we choose) come up on certain architectures.

However, what I do strongly object to is moving away from an external BLAS altogether. I don't want my code to run 4-8x slower simply because subtle bugs come up on certain platforms.

@ViralBShah (Member Author)

Yes, fully agree.

@nalimilan (Member)

I support @lindahua's opinion too. If Julia claims to offer high performance, it needs to use a good BLAS, even if that requires more tweaking for small matrices. R uses the reference BLAS, and it is painfully slow compared to what you get by manually switching it to OpenBLAS. Ensuring OpenBLAS is both fast and reliable would be a strong argument in favor of Julia.

@tknopp (Contributor) commented Jun 13, 2014

@lindahua: Since nobody has proposed moving away from BLAS altogether, there is no disagreement here. My proposal was about decoupling and making things configurable.
